{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cointegration Testing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_This setup code is required to run in an IPython notebook_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import seaborn\n",
    "\n",
    "seaborn.set_style(\"darkgrid\")\n",
    "plt.rc(\"figure\", figsize=(16, 6))\n",
    "plt.rc(\"savefig\", dpi=90)\n",
    "plt.rc(\"font\", family=\"sans-serif\")\n",
    "plt.rc(\"font\", size=14)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "We will look at the spot prices of two crude oil benchmarks: West Texas\n",
    "Intermediate (WTI) crude, measured in Cushing, OK, and Brent crude. The underlying\n",
    "data come from the [U.S. Energy Information Administration](https://www.eia.gov/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "from arch.data import crude\n",
    "\n",
    "data = crude.load()\n",
    "log_price = np.log(data)\n",
    "\n",
    "ax = log_price.plot()\n",
    "xl = ax.set_xlim(log_price.index.min(), log_price.index.max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "We can verify that both of these series appear to contain unit roots\n",
    "using Augmented Dickey-Fuller tests. The p-values are large, indicating\n",
    "that the null of a unit root cannot be rejected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from arch.unitroot import ADF\n",
    "\n",
    "ADF(log_price.WTI, trend=\"c\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "ADF(log_price.Brent, trend=\"c\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "The Engle-Granger test is a 2-step test that first estimates a cross-sectional\n",
    "regression, and then tests the residuals from this regression using an\n",
    "Augmented Dickey-Fuller test with modified critical values. The cross-sectional regression is\n",
    "\n",
    "$$ Y_t = X_t \\beta + D_t \\gamma + \\epsilon_t $$\n",
    "\n",
    "where $Y_t$ and $X_t$ combine to contain the set of variables being tested for\n",
    "cointegration and $D_t$ is a set of deterministic regressors that might include\n",
    "a constant, a time trend, or a quadratic time trend. The trend is specified using\n",
    "`trend` as\n",
    "\n",
    "* `\"n\"`: No trend\n",
    "* `\"c\"`: Constant\n",
    "* `\"ct\"`: Constant and time trend\n",
    "* `\"ctt\"`: Constant, time and quadratic trends\n",
    "\n",
    "Here we assume that the cointegrating relationship is exact so that no\n",
    "deterministics are needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from arch.unitroot import engle_granger\n",
    "\n",
    "eg_test = engle_granger(log_price.WTI, log_price.Brent, trend=\"n\")\n",
    "eg_test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `plot` method can be used to plot the model residual. We see that while the residual appears to be mean 0, it might contain a trend."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "fig = eg_test.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The estimated cointegrating vector is exposed through the `cointegrating_vector` property. Here we see it is very close to $[1, -1]$, indicating a simple no-arbitrage relationship."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eg_test.cointegrating_vector"
   ]
  },
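  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two steps of the Engle-Granger procedure can also be sketched by hand to make the mechanics concrete. The cell below is a minimal illustration and not the `arch` implementation: it uses simulated data, includes no deterministic terms, runs a Dickey-Fuller regression without lag augmentation, and does not compute critical values or p-values. The helper `engle_granger_by_hand` and the series `x_sim` and `y_sim` are hypothetical names introduced only for this sketch. Since the simulated pair is cointegrated with vector $[1, -1]$, the estimated slope should be close to 1 and the t-statistic strongly negative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def engle_granger_by_hand(y, x):\n",
    "    # Hypothetical helper: a sketch of the two steps, not the arch implementation\n",
    "    y = np.asarray(y, dtype=float)\n",
    "    x = np.asarray(x, dtype=float).reshape(len(y), -1)\n",
    "    # Step 1: cross-sectional regression y_t = x_t beta + e_t (no deterministics)\n",
    "    beta, *_ = np.linalg.lstsq(x, y, rcond=None)\n",
    "    resid = y - x @ beta\n",
    "    # Step 2: Dickey-Fuller regression without lags, d(e_t) = gamma e_{t-1} + u_t\n",
    "    lagged = resid[:-1]\n",
    "    delta = np.diff(resid)\n",
    "    gamma = (lagged @ delta) / (lagged @ lagged)\n",
    "    err = delta - gamma * lagged\n",
    "    sigma2 = (err @ err) / (len(delta) - 1)\n",
    "    t_stat = gamma / np.sqrt(sigma2 / (lagged @ lagged))\n",
    "    return beta, t_stat\n",
    "\n",
    "\n",
    "# Simulated cointegrated pair: a random walk and a noisy copy of it\n",
    "rng = np.random.default_rng(0)\n",
    "x_sim = np.cumsum(rng.standard_normal(1000))\n",
    "y_sim = x_sim + rng.standard_normal(1000)\n",
    "beta_hat, t_stat = engle_granger_by_hand(y_sim, x_sim)\n",
    "beta_hat, t_stat"
   ]
  },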
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can rerun the test with both a constant and a time trend to see how this affects the conclusion. We firmly reject the null of no cointegration even with this alternative assumption."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "eg_test = engle_granger(log_price.WTI, log_price.Brent, trend=\"ct\")\n",
    "eg_test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "eg_test.cointegrating_vector"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The residuals are clearly mean zero but show evidence of a structural break around the financial crisis of 2008."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "fig = eg_test.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To investigate the changes around the 2008 financial crisis, we can re-run the test using only data through the end of 2008."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "eg_test = engle_granger(log_price[:\"2008\"].WTI, log_price[:\"2008\"].Brent, trend=\"n\")\n",
    "eg_test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These residuals look quite a bit better, although it is possible that the break in the cointegrating vector happened around 2005 when oil prices first surged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "fig = eg_test.plot()\n",
    "ax = fig.get_axes()[0]\n",
    "title = ax.set_title(\"Pre-2009 Cointegration Residual\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phillips-Ouliaris\n",
    "\n",
    "The Phillips-Ouliaris test consists of four distinct tests. Two are similar to the Engle-Granger test, except that a Phillips-Perron-like approach replaces the lags in the ADF test with a long-run variance estimator. The other two use variance-ratio-like approaches. In both cases the test statistic stabilizes when there is no cointegration and diverges, due to singularity of the covariance matrix of the I(1) time series, when there is cointegration.\n",
    "\n",
    "* $Z_t$ - Like PP using the t-stat of the AR(1) coefficient in an AR(1) of the residual from the cross-sectional regression.\n",
    "* $Z_\\alpha$ - Like PP using $T(\\alpha-1)$ and a bias term from the same AR(1)\n",
    "* $P_u$ - A univariate variance ratio test. \n",
    "* $P_z$ - A multivariate variance ratio test.\n",
    "\n",
    "The four test statistics all agree on the crude oil data.\n",
    "\n",
    "The $Z_t$ and $Z_\\alpha$ test statistics are both based on the quantity $\\gamma=\\rho-1$ from the regression $y_t = d_t \\Delta + \\rho y_{t-1} + \\epsilon_t$. The null is rejected in favor of the alternative when $\\gamma<0$ so that the test statistic is _below_ its critical value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from arch.unitroot.cointegration import phillips_ouliaris\n",
    "\n",
    "po_zt_test = phillips_ouliaris(\n",
    "    log_price.WTI, log_price.Brent, trend=\"c\", test_type=\"Zt\"\n",
    ")\n",
    "po_zt_test.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "po_za_test = phillips_ouliaris(\n",
    "    log_price.WTI, log_price.Brent, trend=\"c\", test_type=\"Za\"\n",
    ")\n",
    "po_za_test.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The $P_u$ and $P_z$ statistics are variance ratios where under the null the numerator and denominator are balanced and so converge at the same rate. Under the alternative the denominator converges to zero and the statistic diverges, so that rejection of the null occurs when the test statistic is _above_ a critical value."
   ]
  },
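  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the standard formulation of these tests, the two statistics can be written as below. This is a reference sketch rather than a description of the exact `arch` implementation, and the symbols $\\hat{u}_t$, $\\hat{\\xi}_t$, $\\hat{\\Omega}$ and $M_{zz}$ are notation introduced here.\n",
    "\n",
    "$$ \\hat{P}_u = \\frac{\\hat{\\omega}_{11 \\cdot 2}}{T^{-2} \\sum_{t=1}^{T} \\hat{u}_t^2}, \\qquad \\hat{P}_z = T \\, \\mathrm{tr}\\left( \\hat{\\Omega} M_{zz}^{-1} \\right) $$\n",
    "\n",
    "Here $\\hat{u}_t$ is the residual from the cross-sectional regression, $z_t = (Y_t, X_t')'$ stacks the variables, $M_{zz} = T^{-1} \\sum_{t=1}^{T} z_t z_t'$, and $\\hat{\\Omega}$ is a long-run covariance estimate built from the residuals $\\hat{\\xi}_t$ of the first-order regression $z_t = \\hat{\\Pi} z_{t-1} + \\hat{\\xi}_t$. The conditional long-run variance is $\\hat{\\omega}_{11 \\cdot 2} = \\hat{\\omega}_{11} - \\hat{\\omega}_{21}' \\hat{\\Omega}_{22}^{-1} \\hat{\\omega}_{21}$, using the partition of $\\hat{\\Omega}$ that matches $z_t$. When the series are cointegrated, $\\hat{u}_t$ is stationary, so $T^{-2} \\sum \\hat{u}_t^2$ shrinks toward zero and $\\hat{P}_u$ grows with $T$."
   ]
  },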
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "po_pu_test = phillips_ouliaris(\n",
    "    log_price.WTI, log_price.Brent, trend=\"c\", test_type=\"Pu\"\n",
    ")\n",
    "po_pu_test.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "po_pz_test = phillips_ouliaris(\n",
    "    log_price.WTI, log_price.Brent, trend=\"c\", test_type=\"Pz\"\n",
    ")\n",
    "po_pz_test.summary()"
   ]
  },
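  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why a variance ratio separates the two cases, the cell below hand-rolls a $P_u$-style statistic on simulated data: a long-run variance of the first series conditional on the second, divided by the scaled sum of squared residuals from the cross-sectional regression. It is a rough sketch under stated assumptions (no deterministic terms, a Bartlett kernel with a fixed bandwidth of 10, no critical values) and is not meant to reproduce the numbers reported by `phillips_ouliaris`. The helper `pu_sketch` and the simulated series are hypothetical names introduced only for this illustration. The statistic should be far larger for the cointegrated pair than for two independent random walks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def pu_sketch(y, x, bandwidth=10):\n",
    "    # Hypothetical helper: P_u-style variance ratio with no deterministic terms\n",
    "    y = np.asarray(y, dtype=float)\n",
    "    x = np.asarray(x, dtype=float)\n",
    "    nobs = len(y)\n",
    "    # Residual from the cross-sectional regression of y on x\n",
    "    u = y - x * (x @ y) / (x @ x)\n",
    "    # Residuals from a first-order VAR of z_t = (y_t, x_t) on z_{t-1}\n",
    "    z = np.column_stack([y, x])\n",
    "    coef, *_ = np.linalg.lstsq(z[:-1], z[1:], rcond=None)\n",
    "    xi = z[1:] - z[:-1] @ coef\n",
    "    # Bartlett-kernel long-run covariance of the VAR residuals\n",
    "    omega = xi.T @ xi / len(xi)\n",
    "    for lag in range(1, bandwidth + 1):\n",
    "        weight = 1.0 - lag / (bandwidth + 1)\n",
    "        gamma = xi[lag:].T @ xi[:-lag] / len(xi)\n",
    "        omega += weight * (gamma + gamma.T)\n",
    "    # Long-run variance of y conditional on x\n",
    "    omega_11_2 = omega[0, 0] - omega[0, 1] ** 2 / omega[1, 1]\n",
    "    return omega_11_2 / (u @ u / nobs**2)\n",
    "\n",
    "\n",
    "# Simulated data: one cointegrated pair and one pair of independent random walks\n",
    "rng = np.random.default_rng(0)\n",
    "x_rw = np.cumsum(rng.standard_normal(1000))\n",
    "y_coint = x_rw + rng.standard_normal(1000)\n",
    "y_indep = np.cumsum(rng.standard_normal(1000))\n",
    "pu_coint = pu_sketch(y_coint, x_rw)\n",
    "pu_indep = pu_sketch(y_indep, x_rw)\n",
    "pu_coint, pu_indep"
   ]
  },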
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cointegrating residual is identical to that of the EG test when the same `trend` is used, since the first step is identical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = po_zt_test.plot()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
